Approximating High-Dimensional Range Queries with kNN Indexing Techniques

Authors

  • Michael A. Schuh
  • Tim Wylie
  • Chang Liu
  • Rafal A. Angryk
Abstract

While k-nearest neighbor (kNN) queries are becoming increasingly common due to mobile and geospatial applications, orthogonal range queries in high-dimensional data are extremely important in scientific and web-based applications. For efficient querying, data is typically stored in an index optimized for either kNN or range queries. This can be problematic when data is optimized for kNN retrieval and a user needs a range query, or vice versa. Here, we address the issue of using a kNN-based index for range queries, and outline the general computational geometry problem of adapting these systems to range queries. We refer to these methods as space-based decompositions and provide a straightforward heuristic for this problem. Using iDistance as our applied kNN indexing technique, we also develop an optimal (data-based) algorithm designed specifically for its indexing scheme. We compare this method to the suggested naïve approach using real-world datasets, and the results show that our data-based algorithm consistently performs better.
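To make the general idea concrete, here is a minimal sketch of one simple way to answer an orthogonal range query with a distance-based lookup: enclose the query rectangle in a ball, fetch everything within that ball, then discard false positives. It is an illustration only and is not the heuristic or the data-based algorithm from the paper. The function name `range_query_via_ball` is hypothetical, and a linear scan stands in for a real kNN index such as iDistance.

```python
import math

def range_query_via_ball(points, lo, hi):
    """Answer the orthogonal range query [lo, hi] using a ball query.

    Assumption: a linear scan stands in for a distance-based index lookup.
    Returns (indices of points inside the rectangle, number of candidates).
    """
    # The smallest ball enclosing the rectangle is centered at its midpoint,
    # with radius equal to half the rectangle's diagonal.
    center = [(a + b) / 2.0 for a, b in zip(lo, hi)]
    radius = math.dist(lo, hi) / 2.0
    # Small relative slack so rectangle corners are not lost to rounding.
    radius *= 1.0 + 1e-12

    # Step 1: distance-based retrieval of every point within the ball.
    candidates = [i for i, p in enumerate(points)
                  if math.dist(p, center) <= radius]

    # Step 2: filter candidates that fall outside the rectangle itself.
    hits = [i for i in candidates
            if all(a <= x <= b for x, a, b in zip(points[i], lo, hi))]
    return hits, len(candidates)


points = [[0.5, 0.5], [0.1, 0.9], [2.0, 2.0], [0.9, 0.9], [-0.1, 0.5]]
hits, n_candidates = range_query_via_ball(points, lo=[0.0, 0.0], hi=[1.0, 1.0])
print(hits, n_candidates)
```

In this example the point `[-0.1, 0.5]` lies inside the enclosing ball but outside the rectangle, so it is retrieved as a candidate and then filtered out. The gap between candidates and true hits grows with dimensionality, which is one reason a scheme this simple is wasteful in high-dimensional data.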

Similar resources

Extending High-Dimensional Indexing Techniques Pyramid and iMinMax(θ): Lessons Learned

Pyramid Technique and iMinMax(θ) are two popular high-dimensional indexing approaches that map points in a high-dimensional space to a single-dimensional index. In this work, we perform the first independent experimental evaluation of Pyramid Technique and iMinMax(θ), and discuss in detail promising extensions for testing k-Nearest Neighbor (kNN) and range queries. For datasets with skewed dist...

A Comprehensive Study of iDistance Partitioning Strategies for kNN Queries and High-Dimensional Data Indexing

Efficient database indexing and information retrieval tasks such as k-nearest neighbor (kNN) search remain difficult challenges in large-scale and high-dimensional data. In this work, we perform the first comprehensive analysis of different partitioning strategies for the state-of-the-art high-dimensional indexing technique iDistance. This work greatly extends the discussion of why certa...

Constructing an Effective and Secure Query Services with Rsap Data Perturbation in the Cloud

The cloud is now increasingly popular because users can host and upload large volumes of data in it. Large databases are handed to database service providers, who maintain range query services over them. Some users hold sensitive private data, and in that situation they cannot move the data to the cloud for hosting until we provide security, confidentiality,...

Minimizing the Number of Keypoint Matching Queries for Object Retrieval

To increase the efficiency of interest-point based object retrieval, researchers have put remarkable effort into improving the efficiency of kNN-based feature matching, aiming to match thousands of features against a database within fractions of a second. However, due to the high-dimensional nature of image features that reduces the effectiveness of index structures (curse of dimensio...

Distributed computation of the knn graph for large high-dimensional point sets

High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k nearest neighbor (knn) graphs. The knn graph of a data set is obtained by connecting each point to its k closest points. As the research in the above-mentioned fields progressively addresses problems of unprecedented complexity, the demand for...

Publication date: 2014